Computing Temporal Trends in Web Documents
Abstract
Most existing methods of web content mining assume that web documents are static. This assumption is inadequate for long-term monitoring and analysis of web content, since both users' interests and the content of most web sites change continuously over time. In this research, we are interested in developing computationally intelligent and efficient text mining techniques that will enable continuous comparison between documents provided by the same source (website, institute, organization, cult, author, etc.) or viewed by the same group of users (e.g., university students), as well as timely detection of temporal trends in those documents. Our approach builds upon the recently developed methodology for fuzzy comparison of frequency distributions. The proposed techniques are evaluated on a real-world stream of web traffic.
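The idea of comparing frequency distributions across time can be illustrated with a minimal sketch. It is not the fuzzy comparison methodology the abstract refers to, whose details are not given here; it swaps in a plain difference of relative term frequencies between two time windows. The function names, the whitespace tokenization, and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter


def term_distribution(docs):
    """Relative term frequencies over a list of documents.

    Assumption: naive lowercase whitespace tokenization.
    """
    counts = Counter(word for doc in docs for word in doc.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def trending_terms(earlier_docs, later_docs, threshold=0.05):
    """Terms whose relative frequency changed by more than `threshold`.

    Positive values mean the term became more frequent in the later window.
    Assumption: a crisp threshold stands in for the fuzzy comparison.
    """
    p = term_distribution(earlier_docs)
    q = term_distribution(later_docs)
    changes = {t: q.get(t, 0.0) - p.get(t, 0.0) for t in set(p) | set(q)}
    return {t: d for t, d in changes.items() if abs(d) > threshold}
```

Calling `trending_terms(docs_from_last_month, docs_from_this_month)` on two hypothetical document windows from the same source would return the terms that rose or fell most sharply between them.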
Similar resources
Introducing New Trends for Persian CAPTCHA
A popular test called CAPTCHA is used on the Web to distinguish human users from computer programs and thereby enhance security. CAPTCHA plays an important role in preventing Denial of Service (DoS) attacks in computer networks. There are many different types of CAPTCHA in different languages. Due to the expansion of the Persian language and documents on the internet, creating a suitable Persian CAPTCHA seems t...
Temporal ranking for fresh information retrieval
In business, the retrieval of up-to-date, or fresh, information is very important. It is difficult for conventional search engines based on a centralized architecture to retrieve fresh information, because they take a long time to collect documents via Web robots. In contrast to a centralized architecture, a search engine based on a distributed architecture does not need to collect documents, b...
The design, implementation, and performance of the V2 temporal document database system
It is now feasible to store previous versions of documents, and not only the most recent version, which has been the traditional approach. This is of interest in a number of applications, including temporal document databases, web archiving systems, and temporal XML warehouses. In this paper, we describe the architecture and the implementation of V2, a temporal document database syst...
Concepts of Bitemporal Database Theory and the Evolution of Web Documents
A vast amount of temporal information is provided on the web. Even though many facts expressed in documents are time-related, the temporal properties of web presentations have not received much attention. In database research, temporal databases have become a mainstream topic in recent years. In web documents, temporal data may exist as metadata in the header and as user-directed data in the bo...
Analyzing the Collaboration Network of Global Scientific Outputs in the Field of Bibliotherapy in the Web of Science Database
Background and Aim: Bibliotherapy is a useful approach to the prevention and treatment of mental disorders and has given rise to many scientific publications in this field. The purpose of this study was to investigate the publication trends in the field of bibliotherapy and visualize the structure of its scientific collaborations based on the Web of Science database during the perio...